Drawing from the resources of psychoanalysis and critical media studies, in this paper we develop an analysis of Large Language Models (LLMs) as automated subjects. We argue the intentional fictional projection of subjectivity onto LLMs can yield an alternate frame through which AI behaviour, including its productions of bias and harm, can be analysed. First, we introduce language models, discuss their significance and risks, and outline our case for interpreting model design and outputs with support from psychoanalytic concepts. We trace a brief history of language models, culminating with the releases, in 2022, of systems that realise state-of-the-art natural language processing performance. We engage with one such system, OpenAI's InstructGPT, as a case study, detailing the layers of its construction and conducting exploratory and semi-structured interviews with chatbots. These interviews probe the model's moral imperatives to be helpful, truthful and harmless by design. The model acts, we argue, as the condensation of often competing social desires, articulated through the internet and harvested into training data, which must then be regulated and repressed. This foundational structure can however be redirected via prompting, so that the model comes to identify with, and transfer, its commitments to the immediate human subject before it. In turn, these automated productions of language can lead to the human subject projecting agency upon the model, effecting occasionally further forms of countertransference. We conclude that critical media methods and psychoanalytic theory together offer a productive frame for grasping the powerful new capacities of AI-driven language systems.
translated by 谷歌翻译
As text generated by large language models proliferates, it becomes vital to understand how humans engage with such text, and whether or not they are able to detect when the text they are reading did not originate with a human writer. Prior work on human detection of generated text focuses on the case where an entire passage is either human-written or machine-generated. In this paper, we study a more realistic setting where text begins as human-written and transitions to being generated by state-of-the-art neural language models. We show that, while annotators often struggle at this task, there is substantial variance in annotator skill and that given proper incentives, annotators can improve at this task over time. Furthermore, we conduct a detailed comparison study and analyze how a variety of variables (model size, decoding strategy, fine-tuning, prompt genre, etc.) affect human detection performance. Finally, we collect error annotations from our participants and use them to show that certain textual genres influence models to make different types of errors and that certain sentence-level features correlate highly with annotator selection. We release the RoFT dataset: a collection of over 21,000 human annotations paired with error classifications to encourage future work in human detection and evaluation of generated text.
translated by 谷歌翻译
Multimodal integration of text, layout and visual information has achieved SOTA results in visually rich document understanding (VrDU) tasks, including relation extraction (RE). However, despite its importance, evaluation of the relative predictive capacity of these modalities is less prevalent. Here, we demonstrate the value of shared representations for RE tasks by conducting experiments in which each data type is iteratively excluded during training. In addition, text and layout data are evaluated in isolation. While a bimodal text and layout approach performs best (F1=0.684), we show that text is the most important single predictor of entity relations. Additionally, layout geometry is highly predictive and may even be a feasible unimodal approach. Despite being less effective, we highlight circumstances where visual information can bolster performance. In total, our results demonstrate the efficacy of training joint representations for RE.
translated by 谷歌翻译
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
translated by 谷歌翻译
有限混合物建模是聚类领域的一种流行方法,并且在很大程度上是由于其软聚类成员资格概率所致。但是,EM算法是适合有限混合模型的最常见算法,是许多问题的受害者。我们解决了使用有限混合模型的困扰聚类的这些问题,包括在高维情况下与局部最大值和算法速度问题相对应的解决方案的收敛。这是通过开发两种新型算法来完成的,这些算法结合了数据矩阵的光谱分解和非参数bootstrap采样方案。模拟显示了我们的算法的有效性,不仅证明了它们的灵活性,而且还证明了与其他(自举)聚类算法相比,它们避免了与局部墨西哥相对应的溶液的能力。我们的新型算法通常具有更一致的收敛标准,并且在适合有限混合模型的其他自举算法中,速度显着提高。
translated by 谷歌翻译
本文考虑了从野外单视图像中无监督的3D对象重建的问题。由于歧义性和内在的不良性,这个问题本质上难以解决,因此需要强大的正则化以实现不同潜在因素的分离。与现有的作品将明确的正规化引入目标功能不同,我们研究了一个不同的空间进行隐式正则化 - 潜在空间的结构。具体而言,我们限制了潜在空间的结构,以捕获潜在因素的拓扑因果排序(即代表因果关系作为定向无环形图)。我们首先表明,不同的因果顺序对于3D重建至关重要,然后探索几种方法以找到与任务有关的因果因素排序。我们的实验表明,潜在空间结构确实是隐式正规化,并引入了有益于重建的电感偏见。
translated by 谷歌翻译
公平的机器学习研究人员(ML)围绕几个公平标准结合,这些标准为ML模型公平提供了正式的定义。但是,这些标准有一些严重的局限性。我们确定了这些正式公平标准的四个主要缺点,并旨在通过扩展性能预测以包含分配强大的目标来帮助解决这些问题。
translated by 谷歌翻译
安全可靠的自治解决方案是下一代智能运输系统的关键组成部分。这种系统中的自动驾驶汽车必须实时考虑复杂而动态的驾驶场景,并预测附近驾驶员的行为。人类驾驶行为非常细微,对个别交通参与者具有特殊性。例如,在合并车辆的情况下,驾驶员可能会显示合作或非合作行为。这些行为必须估算并纳入安全有效驾驶的计划过程中。在这项工作中,我们提出了一个框架,用于估计高速公路上驾驶员的合作水平,并计划将动作与驾驶员的潜在行为合并。潜在参数估计问题使用粒子滤波器解决,以近似合作级别的概率分布。包括潜在状态估算的部分可观察到的马尔可夫决策过程(POMDP)在线解决,以提取合并车辆的政策。我们在高保真汽车模拟器中评估我们的方法,以对潜在状态不可知或依赖于$ \ textit {a先验{先验} $假设。
translated by 谷歌翻译
3D场景图(3DSG)是新兴的描述;统一符号,拓扑和度量场景表示。但是,典型的3DSG即使在小环境中包含数百个对象和符号。完整图上的任务计划是不切实际的。我们构建任务法,这是第一个大规模的机器人任务计划基准3DSGS。尽管大多数基准在该领域的基准努力都集中在基于愿景的计划上,但我们系统地研究了符号计划,以使计划绩效与视觉表示学习相结合。我们观察到,在现有方法中,基于经典和学习的计划者都不能在完整的3DSG上实时计划。实现实时计划需要(a)稀疏3DSG进行可拖动计划的进展,以及(b)设计更好利用3DSG层次结构的计划者。针对前一个目标,我们提出了磨砂膏,这是一种由任务条件的3DSG稀疏方法。使经典计划者能够匹配,在某些情况下可以超过最新的学习计划者。我们提出寻求后一个目标,这是一种使学习计划者能够利用3DSG结构的程序,从而减少了当前最佳方法所需的重型查询数量的数量级。我们将开放所有代码和基线,以刺激机器人任务计划,学习和3DSGS的交叉点进行进一步的研究。
translated by 谷歌翻译
处理稀疏奖励是强化学习(RL)的长期挑战。事后观察经验重播(她)通过重复使用失败的轨迹作为一个目标作为另一个目标的轨迹来解决这个问题。这允许最低奖励密度和跨多个目标的概括。但是,已知这种策略会导致偏见的价值函数,因为更新规则低估了随机环境中不良结果的可能性。我们提出了一种基于重要性抽样算法的渐近无偏见,以解决此问题,而无需在确定性环境上牺牲绩效。我们在一系列机器人系统上展示了其有效性,包括挑战高维随机环境。
translated by 谷歌翻译